Adaptive gain control based on echo canceller performance information

ABSTRACT

A system and method for providing a stable gain from an adaptive gain control device in a signal path. An echo canceller is also located in the signal path, and is used to provide performance information regarding losses in the signal. This performance information is fed to the automatic gain control device via a connection. The automatic gain control device thereafter uses the performance information to determine a maximum gain that might be provided based upon losses caused by echo conditions. The gain, however, is limited in order to provide for a stable system. The performance information includes a loss rate that includes a combination of the echo return loss and the echo return loss enhancement.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 11/621,327, filed Jan. 9, 2007, which is a continuation of U.S. patent application Ser. No. 10/077,133, filed Feb. 15, 2002, which is a continuation-in-part of U.S. patent application Ser. No. 09/522,185, filed Mar. 9, 2000, which is a continuation-in-part of application Ser. No. 09/493,458, filed Jan. 28, 2000, which is a continuation-in-part of application Ser. No. 09/454,219, filed Dec. 9, 1999, priority of each application which is hereby claimed under 35 U.S.C. §120. All these applications are expressly incorporated herein by reference as though set forth in full.

FIELD OF THE INVENTION

The present invention relates generally to telecommunications systems, and more particularly, to a system and method for using information derived from an echo canceller to provide automatic gain control.

BACKGROUND OF THE INVENTION

Telephony devices, such as telephones, analog fax machines, and data modems, have traditionally utilized circuit-switched networks to communicate. With the current state of technology, it is desirable for telephony devices to communicate over the Internet, or other packet-based networks. Heretofore, an integrated system for interfacing various telephony devices over packet-based networks has been difficult due to the different modulation schemes of the telephony devices. Accordingly, it would be advantageous to have an efficient and robust integrated system for the exchange of voice, fax data and modem data between telephony devices and packet-based networks.

Such devices usually include some form of automatic gain control to compensate for losses when a signal progresses through the system. Telephony devices might typically include at least one echo canceling device to compensate for the echo return loss (ERL) that might occur in transmitting a signal from a near end to a far end, and vice versa. This echo canceling device can be used to provide a wealth of information regarding the processed signal. Accordingly, what is needed in the field is a gain control device that uses certain information provided by the echo canceling device in order to provide a more stable automatic gain control.

SUMMARY OF THE INVENTION

As signal losses are incurred in a particular system, gain control devices can be applied to boost the levels of the signal. Generally, it is better to have a stronger signal for use within the system. Hence it is desirable to apply as much gain as possible to a signal, but within the constraints of the system. If too much gain is applied, then instabilities in the system might result.

Automatic gain control (AGC) devices provide an automatic adjustment to the gain of a signal based upon such factors as the incoming signal level, and the like. However, when used with any type of system, an AGC device can become unstable if the signal conditions require an excessive gain to be provided. In particular, if the hybrid (or model) of the echo return loss of a system is poor, and the AGC is applying a considerable gain, then the overall loop gain can be greater than unity. A system under such constraints may prove to be unstable. Hence, in order to limit the loop gain, most AGCs have fixed limits on the amount of gain that can be applied (see, e.g., ITU-T standards G.168 and G.169, each of which is hereby incorporated by reference).

Systems often include an embedded echo canceller device, or the like. This device might provide information about the signal, such as the echo return loss (ERL) and echo return loss enhancement (ERLE). Moreover, limits on the ERL and ERLE might also be known, as derived from the system, and also the echo canceller device. A combined loss estimate, i.e., ERL+ERLE, can be derived from the echo canceller to show approximately how much signal loss is actually imposed by the system. If the combined loss estimate derived by the echo canceller is high, then the amount of gain to be applied by the AGC can be increased while guaranteeing overall stability of the system.

In prior systems, the AGC and echo canceller are independent components. Accordingly, the maximum AGC gain is based upon the average performance expected from the echo canceller, or a very low maximum gain is used to ensure a stable system. The present system combines the information derived from these devices, wherein the AGC can thereby provide a relatively higher gain, yet still provide for a stable system.

Any variety of techniques might be used to determine the combined loss, and thereafter the maximum AGC gain. One technique herein involves determining the ERL and ERLE, and then combining them to find a combined loss (Acom). Each of the ERL and ERLE are computed from a representative set of power estimators. The gain is thereafter set equal to Acom adjusted down by a certain offset. If this adjusted gain is determined to be more than a set maximum gain, then the lesser of the two is used for the maximum AGC gain.
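By way of illustration only, the gain-limiting rule just described might be sketched as follows. The function name, the parameter names, and the numeric values in the usage lines are assumptions made for this sketch and are not taken from the specification; all quantities are assumed to be expressed in dB.

```python
def max_agc_gain_db(erl_db, erle_db, offset_db, gain_cap_db):
    """Return a maximum AGC gain (dB) from echo canceller loss estimates.

    All names and units here are illustrative assumptions.
    """
    # Combined loss Acom = ERL + ERLE (losses add in the dB domain).
    acom_db = erl_db + erle_db
    # Adjust Acom down by an offset to leave a stability margin.
    adjusted_db = acom_db - offset_db
    # The lesser of the adjusted gain and the set maximum gain is used.
    return min(adjusted_db, gain_cap_db)


# Hypothetical values: a strong combined loss is capped by the set maximum,
# while a weak combined loss limits the gain below that maximum.
print(max_agc_gain_db(12.0, 20.0, 6.0, 15.0))
print(max_agc_gain_db(6.0, 4.0, 6.0, 15.0))
```

The sketch does not floor the result at 0 dB; whether a negative value should be permitted is left open here.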

One aspect of the present invention is directed to a system for providing a gain to be generated by a gain control device located in at least one signal path of the system. The system includes an echo canceller and a gain control device in the signal path. There is at least one connection between the echo canceller device and the gain control device. Information pertaining to the signal is provided from the echo canceller device to the gain control device so that the gain can be maximized in light of the information.

According to another aspect of the present invention, a method is provided which provides a gain to be generated by a gain control device located in at least one signal path of the system. According to the method, a signal is received with an echo canceller device in the signal path and echo canceller performance information is generated. The performance information is sent to a gain control device in the signal path. The performance information is used to generate a gain that is maximized in light of the information.

According to another aspect of the present invention, a method is provided which generates an echo return loss (ERL) estimate for a communication signal. Pursuant to the method, an ERL value and an ERLc value are determined. The ERL estimate is then denoted as a function of the ERL value and the ERLc value.

According to yet another aspect of the present invention, a method is provided which generates an echo return loss enhancement (ERLE) estimate for a communication signal. According to the method, a first long term ERLE value ERLElt and a second long term ERLE value ERLE′lt are determined. The ERLE estimate is then denoted as a function of the ERLElt value and the ERLE′lt value.

Another aspect of the present invention is directed to a method of generating an echo return loss (ERL) estimate for a communication signal. Pursuant to the method, a first long term ERL value ERLlt and a second long term ERL value ERL′lt are determined. The ERL estimate is then denoted as a function of the ERLlt value and the ERL′lt value.

It is understood that other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein embodiments of the invention are shown and described only by way of illustration of the best modes contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modification in various other respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.

DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:

FIG. 1 is a block diagram of a packet-based infrastructure providing a communication medium with a number of telephony devices in accordance with a preferred embodiment of the present invention.

FIG. 1A is a block diagram of a packet-based infrastructure providing a communication medium with a number of telephony devices in accordance with a preferred embodiment of the present invention.

FIG. 2 is a block diagram of a signal processing system implemented with a programmable digital signal processor (DSP) software architecture in accordance with a preferred embodiment of the present invention.

FIG. 3 is a block diagram of the software architecture operating on the DSP platform of FIG. 2 in accordance with a preferred embodiment of the present invention.

FIG. 4 is a state machine diagram of the operational modes of a virtual device driver for packet-based network applications in accordance with a preferred embodiment of the present invention.

FIG. 5 is a block diagram of several signal processing systems in the voice mode for interfacing between a switched circuit network and a packet-based network in accordance with a preferred embodiment of the present invention.

FIG. 6 is a system block diagram of a signal processing system operating in a voice mode in accordance with a preferred embodiment of the present invention.

FIG. 6A is a block diagram, according to one aspect of the present invention, of a portion of FIG. 6 which further includes at least one automatic gain control which receives information from at least one echo canceller.

FIG. 7 is a representative flow chart, according to one aspect of the present invention, for determining a combined loss estimate.

FIGS. 8A and 8B show a representative sample flow of blocks, and certain quantities that might be computed in relation to this sample flow.

FIG. 9 is a representative flow chart, according to one aspect of the present invention, for determining a setting for a near end detector.

FIG. 10 is a representative flow chart, according to one aspect of the present invention, for determining an ERL estimation.

FIG. 11 is a representative flow chart, according to one aspect of the present invention, for updating the ERL estimator.

FIG. 12 is a representative flow chart, according to one aspect of the present invention, for determining a first long-term ERL estimate.

FIG. 13 is a representative flow chart, according to one aspect of the present invention, for determining an ERLE estimation.

FIG. 14 is a representative flow chart, according to one aspect of the present invention, for updating the ERLE estimator.

FIG. 15 is a representative flow chart, according to one aspect of the present invention, for determining a first long term ERLE estimate.

FIG. 16 is a representative flow chart, according to one aspect of the present invention, for determining a maximum AGC gain.

DETAILED DESCRIPTION

An Embodiment of a Signal Processing System

In a preferred embodiment of the present invention, a signal processing system is employed to interface telephony devices with packet-based networks. Telephony devices include, by way of example, analog and digital phones, Ethernet phones, Internet Protocol phones, fax machines, data modems, cable modems, interactive voice response systems, PBXs, key systems, and any other conventional telephony devices known in the art. The described preferred embodiment of the signal processing system can be implemented with a variety of technologies including, by way of example, embedded communications software that enables transmission of information, including voice, fax and modem data over packet-based networks. The embedded communications software is preferably run on programmable digital signal processors (DSPs) and is used in gateways, cable modems, remote access servers, PBXs, and other packet-based network appliances.

An exemplary topology is shown in FIG. 1 with a packet-based network 10 providing a communication medium between various telephony devices. Each network gateway 12 a, 12 b, 12 c includes a signal processing system which provides an interface between the packet-based network 10 and a number of telephony devices. In the described exemplary embodiment, each network gateway 12 a, 12 b, 12 c supports a fax machine 14 a, 14 b, 14 c, a telephone 13 a, 13 b, 13 c, and a modem 15 a, 15 b, 15 c. As will be appreciated by those skilled in the art, each network gateway 12 a, 12 b, 12 c could support a variety of different telephony arrangements. By way of example, each network gateway might support any number of telephony devices and/or circuit-switched/packet-based networks including, among others, analog telephones, Ethernet phones, fax machines, data modems, PSTN lines (Public Switched Telephone Network), ISDN lines (Integrated Services Digital Network), T1 systems, PBXs, key systems, or any other conventional telephony device and/or circuit-switched/packet-based network. In the described exemplary embodiment, two of the network gateways 12 a, 12 b provide a direct interface between their respective telephony devices and the packet-based network 10. The other network gateway 12 c is connected to its respective telephony device through a PSTN 19. The network gateways 12 a, 12 b, 12 c permit voice, fax and modem data to be carried over packet-based networks such as PCs running through a USB (Universal Serial Bus) or an asynchronous serial interface, Local Area Networks (LAN) such as Ethernet, Wide Area Networks (WAN) such as Internet Protocol (IP), Frame Relay (FR), Asynchronous Transfer Mode (ATM), Public Digital Cellular Network such as TDMA (IS-13x), CDMA (IS-9x) or GSM for terrestrial wireless applications, or any other packet-based system.

Another exemplary topology is shown in FIG. 1A. The topology of FIG. 1A is similar to that of FIG. 1 but includes a second packet-based network 16 that is connected to packet-based network 10 and to telephony devices 13 b, 14 b and 15 b via network gateway 12 b. The signal processing system of network gateway 12 b provides an interface between packet-based network 10 and packet-based network 16 in addition to an interface between packet-based networks 10, 16 and telephony devices 13 b, 14 b and 15 b. Network gateway 12 d includes a signal processing system which provides an interface between packet-based network 16 and fax machine 14 d, telephone 13 d, and modem 15 d.

The exemplary signal processing system can be implemented with a programmable DSP software architecture as shown in FIG. 2. This architecture has a DSP 17 with memory 18 at the core, a number of network channel interfaces 19 and telephony interfaces 20, and a host 21 that may reside in the DSP itself or on a separate microcontroller. The network channel interfaces 19 provide multi-channel access to the packet-based network. The telephony interfaces 20 can be connected to a circuit-switched network interface such as a PSTN system, or directly to any telephony device. The programmable DSP is effectively hidden within the embedded communications software layer. The software layer binds all core DSP algorithms together, interfaces the DSP hardware to the host, and provides low-level services such as the allocation of resources to allow higher level software programs to run.

An exemplary multi-layer software architecture operating on a DSP platform is shown in FIG. 3. A user application layer 26 provides overall executive control and system management, and directly interfaces a DSP server 25 to the host 21 (see FIG. 2). The DSP server 25 provides DSP resource management and telecommunications signal processing. Operating below the DSP server layer are a number of physical devices (PXD) 30 a, 30 b, 30 c. Each PXD provides an interface between the DSP server 25 and an external telephony device (not shown) via a hardware abstraction layer (HAL) 34.

The DSP server 25 includes a resource manager 24 which receives commands from, forwards events to, and exchanges data with the user application layer 26. The user application layer 26 can either be resident on the DSP 17 or alternatively on the host 21 (see FIG. 2), such as a microcontroller. An application programming interface 27 (API) provides a software interface between the user application layer 26 and the resource manager 24. The resource manager 24 manages the internal/external program and data memory of the DSP 17. In addition, the resource manager dynamically allocates DSP resources and performs command routing as well as other general purpose functions.

The DSP server 25 also includes virtual device drivers (VHDs) 22 a, 22 b, 22 c. The VHDs are a collection of software objects that control the operation of and provide the facility for real time signal processing. Each VHD 22 a, 22 b, 22 c includes an inbound and outbound media queue (not shown) and a library of signal processing services specific to that VHD 22 a, 22 b, 22 c. In the described exemplary embodiment, each VHD 22 a, 22 b, 22 c is a complete self-contained software module for processing a single channel with a number of different telephony devices. Multiple channel capability can be achieved by adding VHDs to the DSP server 25. The resource manager 24 dynamically controls the creation and deletion of VHDs and services.

A switchboard 32 in the DSP server 25 dynamically inter-connects the PXDs 30 a, 30 b, 30 c with the VHDs 22 a, 22 b, 22 c. Each PXD 30 a, 30 b, 30 c is a collection of software objects which provide signal conditioning for one external telephony device. For example, a PXD may provide volume and gain control for signals from a telephony device prior to communication with the switchboard 32. Multiple telephony functionalities can be supported on a single channel by connecting multiple PXDs, one for each telephony device, to a single VHD via the switchboard 32. Connections within the switchboard 32 are managed by the user application layer 26 via a set of API commands to the resource manager 24. The number of PXDs and VHDs is expandable, and limited only by the memory size and the MIPS (millions of instructions per second) of the underlying hardware.
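Purely as an illustrative sketch, the many-to-one relationship between PXDs and a VHD that the switchboard maintains might be modeled as below. The class name, method names, and identifier strings are hypothetical and do not represent the API of the described system; the sketch also assumes that each PXD is connected to at most one VHD at a time, which the description does not state.

```python
class Switchboard:
    """Toy model of dynamic PXD-to-VHD connections (illustrative only)."""

    def __init__(self):
        # Maps a PXD identifier to the VHD identifier it is connected to.
        self._links = {}

    def connect(self, pxd_id, vhd_id):
        # Assumption: reconnecting a PXD replaces its previous connection.
        self._links[pxd_id] = vhd_id

    def disconnect(self, pxd_id):
        # Removing an unknown PXD is treated as a no-op.
        self._links.pop(pxd_id, None)

    def pxds_for(self, vhd_id):
        # All PXDs currently routed to the given VHD, in sorted order.
        return sorted(p for p, v in self._links.items() if v == vhd_id)


# Several PXDs (one per telephony device) sharing a single VHD channel.
board = Switchboard()
board.connect("pxd_30a", "vhd_22a")
board.connect("pxd_30b", "vhd_22a")
board.connect("pxd_30c", "vhd_22b")
print(board.pxds_for("vhd_22a"))
```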

A hardware abstraction layer (HAL) 34 interfaces directly with the underlying DSP 17 hardware (see FIG. 2) and exchanges telephony signals between the external telephony devices and the PXDs. The HAL 34 includes basic hardware interface routines, including DSP initialization, target hardware control, codec sampling, and hardware control interface routines. The DSP initialization routine is invoked by the user application layer 26 to initiate the initialization of the signal processing system. The DSP initialization sets up the internal registers of the signal processing system for memory organization, interrupt handling, timer initialization, and DSP configuration. Target hardware initialization involves the initialization of all hardware devices and circuits external to the signal processing system. The HAL 34 is a physical firmware layer that isolates the communications software from the underlying hardware. This methodology allows the communications software to be ported to various hardware platforms by porting only the affected portions of the HAL 34 to the target hardware.

The exemplary software architecture described above can be integrated into numerous telecommunications products. In an exemplary embodiment, the software architecture is designed to support telephony signals between telephony devices (and/or circuit-switched networks) and packet-based networks. A network VHD (NetVHD) is used to provide a single channel of operation and provide the signal processing services for transparently managing voice, fax, and modem data across a variety of packet-based networks. More particularly, the NetVHD encodes and packetizes DTMF, voice, fax, and modem data received from various telephony devices and/or circuit-switched networks and transmits the packets to the user application layer. In addition, the NetVHD disassembles DTMF, voice, fax, and modem data from the user application layer, decodes the packets into signals, and transmits the signals to the circuit-switched network or device.

An exemplary embodiment of the NetVHD operating in the described software architecture is shown in FIG. 4. The NetVHD includes four operational modes, namely voice mode 36, voiceband data mode 37, fax relay mode 40, and data relay mode 42. In each operational mode, the resource manager invokes various services. For example, in the voice mode 36, the resource manager invokes call discrimination 44, packet voice exchange 48, and packet tone exchange 50. The packet voice exchange 48 may employ numerous voice compression algorithms, including, among others, Linear 128 kbps, G.711 u-law/A-law 64 kbps (ITU Recommendation G.711 (1988): Pulse code modulation (PCM) of voice frequencies), G.726 16/24/32/40 kbps (ITU Recommendation G.726 (12/90): 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM)), G.729A 8 kbps (Annex A (11/96) to ITU Recommendation G.729: Coding of speech at 8 kbit/s using conjugate structure algebraic-code-excited linear-prediction (CS-ACELP), Annex A: Reduced complexity 8 kbit/s CS-ACELP speech codec), and G.723 5.3/6.3 kbps (ITU Recommendation G.723.1 (03/96): Dual rate coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s). The contents of each of the foregoing ITU Recommendations are incorporated herein by reference as if set forth in full. The packet voice exchange 48 is common to both the voice mode 36 and the voiceband data mode 37. In the voiceband data mode 37, the resource manager invokes the packet voice exchange 48 for transparently exchanging data without modification (other than packetization) between the telephony device (or circuit-switched network) and the packet-based network. This is typically used for the exchange of fax and modem data when bandwidth concerns are minimal as an alternative to demodulation and remodulation. During the voiceband data mode 37, the human speech detector service 59 is also invoked by the resource manager. The human speech detector 59 monitors the signal from the near end telephony device for speech. In the event that speech is detected by the human speech detector 59, an event is forwarded to the resource manager which, in turn, causes the resource manager to terminate the human speech detector service 59 and invoke the appropriate services for the voice mode 36 (i.e., the call discriminator, the packet tone exchange, and the packet voice exchange).

In the fax relay mode 40, the resource manager invokes a packet fax exchange 52 service. The packet fax exchange 52 may employ various data pumps including, among others, V.17 which can operate up to 14,400 bits per second, V.29 which uses a 1700-Hz carrier that is varied in both phase and amplitude, resulting in 16 combinations of 8 phases and 4 amplitudes which can operate up to 9600 bits per second, and V.27ter which can operate up to 4800 bits per second. Likewise, the resource manager invokes a packet data exchange 54 service in the data relay mode 42. The packet data exchange 54 may employ various data pumps including, among others, V.22bis/V.22 with data rates up to 2400 bits per second, V.32bis/V.32 which enables full-duplex transmission at 14,400 bits per second, and V.34 which operates up to 33,600 bits per second. The ITU Recommendations setting forth the standards for the foregoing data pumps are incorporated herein by reference as if set forth in full.

In the described exemplary embodiment, the user application layer does not need to manage any service directly. The user application layer manages the session using high-level commands directed to the NetVHD, which in turn directly runs the services. However, the user application layer can access more detailed parameters of any service if necessary to change, by way of example, default functions for any particular application.

In operation, the user application layer opens the NetVHD and connects it to the appropriate PXD. The user application then may configure various operational parameters of the NetVHD, including, among others, default voice compression (Linear, G.711, G.726, G.723.1, G.723.1A, G.729A, G.729B), fax data pump (Binary, V.17, V.29, V.27ter), and modem data pump (Binary, V.22bis, V.32bis, V.34). The user application layer then loads an appropriate signaling service (not shown) into the NetVHD, configures it and sets the NetVHD to the On-hook state.

In response to events from the signaling service (not shown) via a near end telephony device (hookswitch), or signal packets from the far end, the user application will set the NetVHD to the appropriate off-hook state, typically voice mode. In an exemplary embodiment, if the signaling service event is triggered by the near end telephony device, the packet tone exchange will generate dial tone. Once a DTMF tone is detected, the dial tone is terminated. The DTMF tones are packetized and forwarded to the user application layer for transmission on the packet-based network. The packet tone exchange could also play ringing tone back to the near end telephony device (when a far end telephony device is being rung), and a busy tone if the far end telephony device is unavailable. Other tones may also be supported to indicate all circuits are busy, or that an invalid sequence of DTMF digits was entered on the near end telephony device.

Once a connection is made between the near end and far end telephony devices, the call discriminator is responsible for differentiating between a voice and machine call by detecting the presence of a 2100 Hz tone (as in the case when the telephony device is a fax or a modem), a 1100 Hz tone or V.21 modulated high level data link control (HDLC) flags (as in the case when the telephony device is a fax). If a 1100 Hz tone, or V.21 modulated HDLC flags are detected, a calling fax machine is recognized. The NetVHD then terminates the voice mode 36 and invokes the packet fax exchange to process the call. If, however, a 2100 Hz tone is detected, the NetVHD terminates voice mode and invokes the packet data exchange.

The packet data exchange service further differentiates between a fax and modem by continuing to monitor the incoming signal for V.21 modulated HDLC flags, which if present, indicate that a fax connection is in progress. If HDLC flags are detected, the NetVHD terminates packet data exchange service and initiates packet fax exchange service. Otherwise, the packet data exchange service remains operative. In the absence of a 1100 Hz or 2100 Hz tone, or V.21 modulated HDLC flags, the voice mode remains operative.
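The call discrimination decisions described above might be sketched, under stated assumptions, as a small transition function. The enum, the function name, and the boolean detector inputs are hypothetical; in particular, checking the fax indications before the 2100 Hz tone when several are present at once is an assumption of this sketch, since the description does not address that case.

```python
from enum import Enum


class Service(Enum):
    VOICE = "voice mode"
    PACKET_FAX = "packet fax exchange"
    PACKET_DATA = "packet data exchange"


def discriminate(current, tone_1100, tone_2100, hdlc_flags):
    """Return the service that should run next, given detector outputs."""
    if current is Service.VOICE:
        # A 1100 Hz tone or V.21 HDLC flags identify a calling fax machine.
        if tone_1100 or hdlc_flags:
            return Service.PACKET_FAX
        # A 2100 Hz tone alone could be a fax or a modem: start data exchange.
        if tone_2100:
            return Service.PACKET_DATA
        # No tone and no flags: voice mode remains operative.
        return Service.VOICE
    if current is Service.PACKET_DATA and hdlc_flags:
        # HDLC flags seen later indicate a fax connection is in progress.
        return Service.PACKET_FAX
    return current


# A 2100 Hz answer tone followed later by V.21 HDLC flags.
state = discriminate(Service.VOICE, False, True, False)
state = discriminate(state, False, False, True)
print(state.value)
```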

The Voice Mode

Voice mode provides signal processing of voice signals. As shown in the exemplary embodiment depicted in FIG. 5, voice mode enables the transmission of voice over a packet-based system such as Voice over IP (VoIP, H.323), Voice over Frame Relay (VoFR, FRF-11), Voice Telephony over ATM (VTOA), or any other proprietary network. The voice mode should also permit voice to be carried over traditional media such as time division multiplex (TDM) networks and voice storage and playback systems. Network gateway 55 a supports the exchange of voice between a traditional circuit-switched network 58 and packet-based networks 56 a and 56 b. Network gateways 55 b, 55 c, 55 d, 55 e support the exchange of voice between packet-based network 56 a and a number of telephony devices 57 b, 57 c, 57 d, 57 e. In addition, network gateways 55 f, 55 g, 55 h, 55 i support the exchange of voice between packet-based network 56 b and telephony devices 57 f, 57 g, 57 h, 57 i. Telephony devices 57 a, 57 b, 57 c, 57 d, 57 e, 57 f, 57 g, 57 h, 57 i can be any type of telephony device including telephones, facsimile machines and modems.

The PXDs for the voice mode provide echo cancellation, gain, and automatic gain control. The network VHD invokes numerous services in the voice mode including call discrimination, packet voice exchange, and packet tone exchange. These network VHD services operate together to provide: (1) an encoder system with DTMF detection, call progress tone detection, voice activity detection, voice compression, and comfort noise estimation, and (2) a decoder system with delay compensation, voice decoding, DTMF generation, comfort noise generation and lost frame recovery.

The services invoked by the network VHD in the voice mode and the associated PXD are shown schematically in FIG. 6. In the described exemplary embodiment, the PXD 60 provides two way communication with a telephone or a circuit-switched network, such as a PSTN line (e.g. DS0) carrying a 64 kb/s pulse code modulated (PCM) signal, i.e., digital voice samples.

The incoming PCM signal 60 a is initially processed by the PXD 60 to remove far end echoes that might otherwise be transmitted back to the far end user. As the name implies, echo in telephone systems is the return of the talker's voice resulting from the operation of the hybrid with its two-four wire conversion. If there is low end-to-end delay, echo from the far end is equivalent to side-tone (echo from the near-end), and therefore, not a problem. Side-tone gives users feedback as to how loud they are talking; and indeed, without side-tone, users tend to talk too loudly. However, far end echo delays of more than about 10 to 30 msec significantly degrade the voice quality and are a major annoyance to the user.

An echo canceller 70 is used to remove echoes from far end speech present on the incoming PCM signal 60 a before routing the incoming PCM signal 60 a back to the far end user. The echo canceller 70 samples an outgoing PCM signal 60 b from the far end user, filters it, and combines it with the incoming PCM signal 60 a. Preferably, the echo canceller 70 is followed by a non-linear processor (NLP) 72 which may mute the digital voice samples when far end speech is detected in the absence of near end speech. The echo canceller 70 may also inject comfort noise which in the absence of near end speech may be roughly at the same level as the true background noise or at a fixed level.

After echo cancellation, the power level of the digital voice samples is normalized by an automatic gain control (AGC) 74 to ensure that the conversation is of an acceptable loudness. Alternatively, the AGC can be performed before the echo canceller 70. However, this approach would entail a more complex design because the gain would also have to be applied to the sampled outgoing PCM signal 60 b. In the described exemplary embodiment, the AGC 74 is designed to adapt slowly, although it should adapt fairly quickly if overflow or clipping is detected. The AGC adaptation should be held fixed if the NLP 72 is activated.
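The three adaptation behaviors just described (slow adaptation, faster reaction to overflow or clipping, and holding while the NLP is active) might be sketched as follows. This is a minimal illustration under assumptions: the function name, the dB-domain formulation, the target-level input, and the step sizes are all hypothetical and are not taken from the described embodiment.

```python
def update_agc_gain_db(gain_db, level_db, target_db, clipping, nlp_active,
                       slow_step_db=0.1, fast_step_db=3.0):
    """Return the next AGC gain in dB (illustrative policy only)."""
    if nlp_active:
        # Adaptation is held fixed while the non-linear processor is active.
        return gain_db
    if clipping:
        # Overflow or clipping: back the gain off quickly.
        return gain_db - fast_step_db
    # Otherwise move slowly toward the gain that would hit the target level,
    # limiting each update to at most slow_step_db in either direction.
    error_db = target_db - (level_db + gain_db)
    step_db = max(-slow_step_db, min(slow_step_db, error_db))
    return gain_db + step_db


# Hypothetical frame sequence: normal adaptation, a clipped frame, NLP active.
g = 6.0
g = update_agc_gain_db(g, -30.0, -20.0, clipping=False, nlp_active=False)
g = update_agc_gain_db(g, -30.0, -20.0, clipping=True, nlp_active=False)
g = update_agc_gain_db(g, -30.0, -20.0, clipping=False, nlp_active=True)
print(round(g, 2))
```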

After AGC, the digital voice samples are placed in the media queue 66 inthe network VHD 62 via the switchboard 32′. In the voice mode, thenetwork VHD 62 invokes three services, namely call discrimination,packet voice exchange, and packet tone exchange. The call discriminator68 analyzes the digital voice samples from the media queue to determinewhether a 2100 Hz tone, a 1100 Hz tone or V.21 modulated HDLC flags arepresent. As described above with reference to FIG. 4, if either tone orHDLC flags are detected, the voice mode services are terminated and theappropriate service for fax or modem operation is initiated. In theabsence of a 2100 Hz tone, a 1100 Hz tone, or HDLC flags, the digitalvoice samples are coupled to the encoder system which includes a voiceencoder 82, a voice activity detector (VAD) 80, a comfort noiseestimator 81, a DTMF detector 76, a call progress tone detector 77 and apacketization engine 78.

Typical telephone conversations have as much as sixty percent silence or inactive content. Therefore, high bandwidth gains can be realized if digital voice samples are suppressed during these periods. A VAD 80, operating under the packet voice exchange, is used to accomplish this function. The VAD 80 attempts to detect digital voice samples that do not contain active speech. During periods of inactive speech, the comfort noise estimator 81 couples silence identifier (SID) packets to a packetization engine 78. The SID packets contain voice parameters that allow the reconstruction of the background noise at the far end.

From a system point of view, the VAD 80 may be sensitive to the change in the NLP 72. For example, when the NLP 72 is activated, the VAD 80 may immediately declare that voice is inactive. In that instance, the VAD 80 may have problems tracking the true background noise level. If the echo canceller 70 generates comfort noise during periods of inactive speech, it may have a different spectral characteristic from the true background noise. The VAD 80 may detect a change in noise character when the NLP 72 is activated (or deactivated) and declare the comfort noise as active speech. For these reasons, the VAD 80 should be disabled when the NLP 72 is activated. This is accomplished by a “NLP on” message 72 a passed from the NLP 72 to the VAD 80.

The voice encoder 82, operating under the packet voice exchange, can be a straight 16 bit PCM encoder or any voice encoder which supports one or more of the standards promulgated by ITU. The encoded digital voice samples are formatted into a voice packet (or packets) by the packetization engine 78. These voice packets are formatted according to an applications protocol and outputted to the host (not shown). The voice encoder 82 is invoked only when digital voice samples with speech are detected by the VAD 80. Since the packetization interval may be a multiple of an encoding interval, both the VAD 80 and the packetization engine 78 should cooperate to decide whether or not the voice encoder 82 is invoked. For example, if the packetization interval is 10 msec and the encoder interval is 5 msec (a frame of digital voice samples is 5 ms), then a frame containing active speech should cause the subsequent frame to be placed in the 10 ms packet regardless of the VAD state during that subsequent frame. This interaction can be accomplished by the VAD 80 passing an “active” flag 80 a to the packetization engine 78, and the packetization engine 78 controlling whether or not the voice encoder 82 is invoked.
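The cooperation between the VAD and the packetization engine described above can be sketched as follows. This is only an illustrative sketch: the function name `frames_to_encode` and the grouping of frames into fixed packets are assumptions, and the text does not say how a packet whose first frame is inactive but whose second frame is active should be handled, so the sketch simply leaves the earlier frame unencoded.

```python
def frames_to_encode(vad_flags, frames_per_packet=2):
    """Decide, per frame, whether the voice encoder should be invoked.

    vad_flags: list of booleans, one "active" flag per 5 ms frame (assumed).
    frames_per_packet: encoder frames per packet (2 for 10 ms / 5 ms).
    Once a frame in a packet is active, every later frame in that same
    packet is encoded regardless of its own VAD state.
    """
    decisions = []
    for start in range(0, len(vad_flags), frames_per_packet):
        active_seen = False
        for flag in vad_flags[start:start + frames_per_packet]:
            active_seen = active_seen or flag
            decisions.append(active_seen)
    return decisions
```

With two frames per packet, an active first frame forces the second frame of the same packet to be encoded, while a fully inactive packet is skipped.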

In the described exemplary embodiment, the VAD 80 is applied after the AGC 74. This approach provides optimal flexibility because both the VAD 80 and the voice encoder 82 are integrated into some speech compression schemes such as those promulgated in ITU Recommendations G.729 with Annex B VAD (March 1996): Coding of Speech at 8 kbits/s Using Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-ACELP), and G.723.1 with Annex A VAD (March 1996): Dual Rate Coder for Multimedia Communications Transmitting at 5.3 and 6.3 kbit/s, the contents of which are hereby incorporated by reference as though set forth in full herein.

Operating under the packet tone exchange, a DTMF detector 76 determines whether or not there is a DTMF signal present at the near end. The DTMF detector 76 also provides a pre-detection flag 76 a which indicates whether or not it is likely that the digital voice sample might be a portion of a DTMF signal. If so, the pre-detection flag 76 a is relayed to the packetization engine 78 instructing it to begin holding voice packets. If the DTMF detector 76 ultimately detects a DTMF signal, the voice packets are discarded, and the DTMF signal is coupled to the packetization engine 78. Otherwise the voice packets are ultimately released from the packetization engine 78 to the host (not shown). The benefit of this method is that there is only a temporary impact on voice packet delay when a DTMF signal is pre-detected in error, and not a constant buffering delay. Whether voice packets are held while the pre-detection flag 76 a is active could be adaptively controlled by the user application layer.
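The hold-and-release behavior described above might be sketched as shown below. The class name `HoldingPacketizer`, its method names, and the use of plain lists for the held and sent packets are hypothetical choices made for illustration; the sketch is not taken from the described embodiment.

```python
class HoldingPacketizer:
    """Illustrative sketch of holding voice packets during DTMF pre-detection."""

    def __init__(self):
        self.held = []   # voice packets held while pre-detection is active
        self.sent = []   # packets released to the host

    def on_voice_packet(self, packet, predetect):
        if predetect:
            # Possible DTMF: hold the packet instead of sending it.
            self.held.append(packet)
        else:
            # Pre-detection cleared without a DTMF detection: release
            # anything that was held, then send the new packet.
            self.sent.extend(self.held)
            self.held.clear()
            self.sent.append(packet)

    def on_dtmf_detected(self, digit):
        # Confirmed DTMF: discard held voice and send the DTMF signal.
        self.held.clear()
        self.sent.append(("DTMF", digit))
```

A false pre-detection therefore delays only the packets held during that interval, rather than adding a constant buffering delay to every packet.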

Similarly, a call progress tone detector 77 also operates under the packet tone exchange to determine whether a precise signaling tone is present at the near end. Call progress tones are those which indicate what is happening to dialed phone calls. Conditions like busy line, ringing called party, bad number, and others each have distinctive tone frequencies and cadences assigned them. The call progress tone detector 77 monitors the call progress state, and forwards a call progress tone signal to the packetization engine to be packetized and transmitted across the packet based network. The call progress tone detector may also provide information regarding the near end hook status which is relevant to the signal processing tasks. If the hook status is on hook, the VAD should preferably mark all frames as inactive, DTMF detection should be disabled, and SID packets should only be transferred if they are required to keep the connection alive.

The decoding system of the network VHD 62 essentially performs the inverse operation of the encoding system. The decoding system of the network VHD 62 comprises a depacketizing engine 84, a voice queue 86, a DTMF queue 88, a precision tone queue 87, a voice synchronizer 90, a DTMF synchronizer 102, a precision tone synchronizer 103, a voice decoder 96, a VAD 98, a comfort noise estimator 100, a comfort noise generator 92, a lost packet recovery engine 94, a tone generator 104, and a precision tone generator 105.

The depacketizing engine 84 identifies the type of packets received from the host (i.e., voice packet, DTMF packet, call progress tone packet, SID packet) and transforms them into frames which are protocol independent. The depacketizing engine 84 then transfers the voice frames (or voice parameters in the case of SID packets) into the voice queue 86, transfers the DTMF frames into the DTMF queue 88 and transfers the call progress tones into the call progress tone queue 87. In this manner, the remaining tasks are, by and large, protocol independent.

A jitter buffer is utilized to compensate for network impairments such as delay jitter caused by packets not arriving at the same time or in the same order in which they were transmitted. In addition, the jitter buffer compensates for lost packets that occur on occasion when the network is heavily congested. In the described exemplary embodiment, the jitter buffer for voice includes a voice synchronizer 90 that operates in conjunction with a voice queue 86 to provide an isochronous stream of voice frames to the voice decoder 96.

Sequence numbers embedded into the voice packets at the far end can be used to detect lost packets, packets arriving out of order, and short silence periods. The voice synchronizer 90 can analyze the sequence numbers, enabling the comfort noise generator 92 during short silence periods and performing voice frame repeats via the lost packet recovery engine 94 when voice packets are lost. SID packets can also be used as an indicator of silent periods causing the voice synchronizer 90 to enable the comfort noise generator 92. Otherwise, during far end active speech, the voice synchronizer 90 couples voice frames from the voice queue 86 in an isochronous stream to the voice decoder 96. The voice decoder 96 decodes the voice frames into digital voice samples suitable for transmission on a circuit switched network, such as a 64 kb/s PCM signal for a PSTN line. The output of the voice decoder 96 (or the comfort noise generator 92 or lost packet recovery engine 94 if enabled) is written into a media queue 106 for transmission to the PXD 60.

The comfort noise generator 92 provides background noise to the near end user during silent periods. If the protocol supports SID packets (and these are supported for VTOA, FRF-11, and VoIP), the comfort noise estimator at the far end encoding system should transmit SID packets. Then, the background noise can be reconstructed by the near end comfort noise generator 92 from the voice parameters in the SID packets buffered in the voice queue 86. However, for some protocols, namely, FRF-11, the SID packets are optional, and other far end users may not support SID packets at all. In these systems, the voice synchronizer 90 must continue to operate properly. In the absence of SID packets, the voice parameters of the background noise at the far end can be determined by running the VAD 98 at the voice decoder 96 in series with a comfort noise estimator 100.

Preferably, the voice synchronizer 90 is not dependent upon sequence numbers embedded in the voice packet. The voice synchronizer 90 can invoke a number of mechanisms to compensate for delay jitter in these systems. For example, the voice synchronizer 90 can assume that the voice queue 86 is in an underflow condition due to excess jitter and perform packet repeats by enabling the lost packet recovery engine 94. Alternatively, the VAD 98 at the voice decoder 96 can be used to estimate whether or not the underflow of the voice queue 86 was due to the onset of a silence period or due to packet loss. In this instance, the spectrum and/or the energy of the digital voice samples can be estimated and the result 98 a fed back to the voice synchronizer 90. The voice synchronizer 90 can then invoke the lost packet recovery engine 94 during voice packet losses and the comfort noise generator 92 during silent periods.

When DTMF packets arrive, they are depacketized by the depacketizing engine 84. DTMF frames at the output of the depacketizing engine 84 are written into the DTMF queue 88. The DTMF synchronizer 102 couples the DTMF frames from the DTMF queue 88 to the tone generator 104. Much like the voice synchronizer, the DTMF synchronizer 102 is employed to provide an isochronous stream of DTMF frames to the tone generator 104. Generally speaking, when DTMF packets are being transferred, voice frames should be suppressed. To some extent, this is protocol dependent. However, the capability to flush the voice queue 86 to ensure that the voice frames do not interfere with DTMF generation is desirable. Essentially, old voice frames which may be queued are discarded when DTMF packets arrive. This will ensure that there is a significant gap before DTMF tones are generated. This is achieved by a “tone present” message 88 a passed between the DTMF queue and the voice synchronizer 90.

The tone generator 104 converts the DTMF signals into a DTMF tone suitable for a standard digital or analog telephone. The tone generator 104 overwrites the media queue 106 to prevent leakage through the voice path and to ensure that the DTMF tones are not too noisy.

There is also a possibility that a DTMF tone may be fed back as an echo into the DTMF detector 76. To prevent false detection, the DTMF detector 76 can be disabled entirely (or disabled only for the digit being generated) during DTMF tone generation. This is achieved by a “tone on” message 104 a passed between the tone generator 104 and the DTMF detector 76. Alternatively, the NLP 72 can be activated while generating DTMF tones.

When call progress tone packets arrive, they are depacketized by the depacketizing engine 84. Call progress tone frames at the output of the depacketizing engine 84 are written into the call progress tone queue 87. The call progress tone synchronizer 103 couples the call progress tone frames from the call progress tone queue 87 to a call progress tone generator 105. Much like the DTMF synchronizer, the call progress tone synchronizer 103 is employed to provide an isochronous stream of call progress tone frames to the call progress tone generator 105. And much like the DTMF tone generator, when call progress tone packets are being transferred, voice frames should be suppressed. To some extent, this is protocol dependent. However, the capability to flush the voice queue 86 to ensure that the voice frames do not interfere with call progress tone generation is desirable. Essentially, old voice frames which may be queued are discarded when call progress tone packets arrive to ensure that there is a significant inter-digit gap before call progress tones are generated. This is achieved by a “tone present” message 87 a passed between the call progress tone queue 87 and the voice synchronizer 90.

The call progress tone generator 105 converts the call progress tone signals into a call progress tone suitable for a standard digital or analog telephone. The call progress tone generator 105 overwrites the media queue 106 to prevent leakage through the voice path and to ensure that the call progress tones are not too noisy.

The outgoing PCM signal in the media queue 106 is coupled to the PXD 60 via the switchboard 32′. The outgoing PCM signal is coupled to an amplifier 108 before being outputted on the PCM output line 60 b.

Adaptive Gain Control Using Information from Echo Canceller

An echo canceller is used in many communications systems to estimate and eliminate the effect of losses across a transmission path. Systems will also generally use an automatic gain control (AGC) device in order to increase the gain and compensate for such losses across the transmission path. The present invention utilizes certain information (or statistics) pertaining to the signal that can be derived from the echo canceller. These statistics might be determined before or after the echo canceling operation. This information is then fed forward to an AGC in the signal path, and the maximum gain allowable can be increased accordingly. The present invention is meant to be generally applicable to any system which might use an echo canceller with AGC type devices. Additionally, other types of devices in the signal path (i.e., other than an echo canceller) might also be used to supply information to the AGC in order to adjust the gain. The particular configuration and usage taught and suggested by the present invention would be useful for any system, as it provides a maximized signal gain, but maintains a stable system.

In terms of the present exemplary embodiment of a communications system, the elements of FIG. 6 can be further modified to include the elements of FIG. 6A. The elements of FIG. 6A (shown generally as 200) are meant to replace (in part, or in whole) those elements shown in block 60 of FIG. 6. For instance, switchboard 32′ is shown interacting with block 60. An ingress signal, or PCM in 202, is also shown represented as Sin (i.e., signal in), and enters a first “near end” echo canceller 204. The near end echo canceller 204 also receives the resulting output (Rout) signal via the connection 203. The signal from the echo canceller 204 is shown as Sout (i.e., signal out). Sout next enters a non-linear processor (NLP) 206, which can supply an NLP “on” indication for use by other devices in the system. The ingress AGC 208 receives the signal resulting from the NLP 206. Additionally, the AGC is shown receiving a combined loss estimate 212 from the echo canceller 204.

The egress signal from the switchboard is shown first entering a “far end” echo canceller 214. The far end echo canceller 214 also receives the ingress signal after the AGC via the connection 213. Thereafter the signal enters an NLP 216. An egress AGC 218 thereafter receives the output from the NLP. The far end echo canceller 214 is shown to provide a combined loss estimate 220 to be used by the AGC in determining a maximum, yet stable, gain that might be applied to the egress signal. After the AGC 218, the Rout signal is also shown to be the PCM out signal 201.

Note that this figure shows an embodiment of an echo canceller and an AGC being used at both the ingress point and the egress point. Either set of devices may or may not be present. For example, the system might only contain the near end echo canceller 204 and an ingress AGC 208. Conversely, the system might only contain the far end echo canceller 214 and an associated egress AGC 218.

One aspect of the present invention is to use the combined loss estimate, from either echo canceller, in the associated automatic gain control device. This allows the AGC to use a higher gain while still guaranteeing the overall stability of the system. This combined loss for a near end echo canceller is the loss in power from PCM out 201 to the signal 205 (Sout) after the echo canceller 204, in the absence of a near end signal. The combined loss estimate is also referred to herein as Acom, and is similarly referred to in the ITU-T standard G.168. The echo level after the echo canceller 204 is the far end level (PCM out) minus the combined loss.

Conversely, for the far end echo canceller, the combined loss is the loss between the signal 209, which exists after the near end AGC 208, and the signal 215, which exists after the far end echo canceller 214.

This combined loss needs to be estimated by the echo canceller (either the ingress and/or egress echo canceller) in order to facilitate this invention. The combined loss estimate 212 (or 220) might be estimated in a variety of ways. One important aspect, however, is to obtain a relatively accurate estimate.

Accordingly, one representative embodiment of the Acom estimator is illustrated by the representative steps 700 as shown in FIG. 7. The combined loss estimate (Acom) is found by first estimating the echo return loss (ERL), as shown in step 702. Next the echo return loss enhancement (ERLE) is determined in step 704. Step 706 shows the combined loss estimate (Acom) being determined as the sum of the derived quantities ERL and ERLE.

The combined loss estimator is based upon a number of power estimators. For the present example, these include but are not limited to short term block powers, windowed powers, peak powers, and a near end detector. The ERL estimation and the ERLE estimation are then performed, with the combined loss being the combination of the two.

Referring now to FIG. 8A, a flow of samples is shown represented as a horizontal axis 802. A short term block power 804 is denoted as B(n). According to this example, the block power consists of the sum of the squares of the sample values over the most recent block of samples. In this representative example, the recent block includes 40 samples, or a 5 msec block at a sampling rate of 8 kHz.
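As a minimal sketch of the block power B(n) described above, the following function sums the squares of the most recent 40 samples. The function name `block_power` and the use of a plain list of samples are assumptions made for illustration only.

```python
def block_power(samples, block_size=40):
    """Sum of squared sample values over the most recent block.

    With block_size=40 and an 8 kHz sampling rate, the block spans
    40 / 8000 s = 5 msec, matching the representative example.
    """
    recent = samples[-block_size:]
    return sum(x * x for x in recent)
```

Only the trailing `block_size` samples contribute, so older samples in the list do not affect the result.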

On a similar sample flow axis, FIG. 8B next demonstrates a windowed power W(n) 820. The windowed power is the sum of the block powers over a certain number X of current blocks and previous blocks. For the purposes of the present example, X will represent seven blocks (six previous blocks and the current block). Therefore, since each block is five msec long, the window power is the power over a 35 msec sliding rectangular window.
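The sliding rectangular window can be sketched with a fixed-length queue of block powers. The class name `WindowedPower` is hypothetical, and the behavior before seven blocks have arrived (summing whatever is available) is an assumption not stated in the text.

```python
from collections import deque


class WindowedPower:
    """Sum of the current block power and the six previous block powers."""

    def __init__(self, num_blocks=7):
        # deque with maxlen drops the oldest block power automatically.
        self._blocks = deque(maxlen=num_blocks)

    def update(self, block_power):
        self._blocks.append(block_power)
        return sum(self._blocks)
```

Feeding one block power every 5 msec yields the power over a 35 msec window once seven blocks have been seen.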

Referring again to FIG. 8A, the peak power P(n) 806 is the peak in B(n) 804 over the tail length of the echo canceller. For instance, for a 64 msec echo canceller, P(n) is then the largest block power over the last 65 msec of the sample flow, or over the last 13 blocks of B(n).
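A sketch of the peak power P(n) follows the same pattern, tracking the largest of the last 13 block powers (13 blocks of 5 msec cover 65 msec). The class name `PeakPower` and the start-up behavior with fewer than 13 blocks are assumptions.

```python
from collections import deque


class PeakPower:
    """Largest block power over the most recent num_blocks blocks."""

    def __init__(self, num_blocks=13):
        self._blocks = deque(maxlen=num_blocks)

    def update(self, block_power):
        self._blocks.append(block_power)
        return max(self._blocks)
```

A large block power keeps dominating the peak until it is more than 13 blocks old, at which point it falls out of the window.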

A near end detector is also used in the present example. FIG. 9 shows a set of representative steps 900 that might be used in setting the near end detector. As shown in step 902, the window power of the far end (Rout) is compared to −36 dBm. If the window power is greater, then a further inquiry is made in step 904 as to whether the near end window power after the canceller (i.e., Sout) is greater than the peak far end power. If Sout is greater than the peak far end power, then yet another inquiry is made in step 906 as to whether the window power after the echo canceller is within 3 dB of the window power before the canceller. In equation form, the inquiry becomes whether Wsout(n) > Wsin(n) − 3 dB. If each of these conditions is satisfied, then step 908 shows the hangover count of the near end detector being set to a maximum value, i.e., 250 msec. If any of these conditions is not satisfied, then step 910 shows an inquiry as to whether there is a tonal signal on the egress (Rout) path. If yes, then step 912 shows the hangover count being set to a maximum value, i.e., 250 msec. The hangover count is generally an amount of time used in voice processing after a last speech frame. If no tonal signal is detected on the egress path, then an inquiry is made in step 914 as to whether the hangover counter is greater than zero. If yes, then step 916 shows the hangover counter being decremented. Note that while the values of −36 dBm, 3 dB, and 250 msec have been used to facilitate the working example above, the present invention is not intended to be limited to these particular values. In an alternative embodiment of the present invention, there are no steps 910 and 912. Thus, if any of the conditions in steps 902, 904 or 906 is not satisfied, then the inquiry is made in step 914 as to whether the hangover counter is greater than zero, regardless of whether there is a tonal signal on the egress path.
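The decision flow of FIG. 9 might be sketched as a single update function. Several details here are assumptions rather than statements from the text: all powers are taken to be already expressed in dBm, the counter is decremented by one 5 msec block per call, and the third test follows the prose description (power after the canceller compared with power before the canceller, less 3 dB). The function and parameter names are hypothetical.

```python
MAX_HANGOVER_MS = 250   # maximum hangover value from the working example
BLOCK_MS = 5            # assumed decrement per update (one block)


def update_hangover(hangover_ms, w_rout_dbm, w_sout_dbm, w_sin_dbm,
                    p_rout_dbm, tone_on_egress=False):
    """Return the new near end hangover count in msec."""
    near_end_present = (
        w_rout_dbm > -36.0                  # step 902: far end window power
        and w_sout_dbm > p_rout_dbm         # step 904: Sout above far end peak
        and w_sout_dbm > w_sin_dbm - 3.0    # step 906: within 3 dB of Sin
    )
    if near_end_present:
        return MAX_HANGOVER_MS              # step 908
    if tone_on_egress:
        return MAX_HANGOVER_MS              # steps 910 and 912
    if hangover_ms > 0:
        return max(0, hangover_ms - BLOCK_MS)   # steps 914 and 916
    return hangover_ms
```

Leaving `tone_on_egress` at its default of `False` corresponds to the alternative embodiment without steps 910 and 912.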

FIG. 10 next shows certain representative steps 1000 which might be used in determining the ERL estimation (represented as step 702 in FIG. 7). The ERL estimators are updated when a certain set of conditions are satisfied. Step 1002 first inquires if the near end hangover count is equal to zero. If yes, then step 1004 next inquires whether the peak far end power is greater than a set amount, i.e., −36 dBm. If yes, then step 1006 next inquires whether the peak far end power is greater than the near end window power (i.e., Wsin(n)). If these collective conditions have been satisfied, then step 1008 shows the process of updating the ERL estimate.
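The three gating conditions of FIG. 10 can be sketched as one predicate. The function name `should_update_erl` and the assumption that both powers are supplied in dBm are illustrative only.

```python
def should_update_erl(hangover_ms, p_rout_dbm, w_sin_dbm, threshold_dbm=-36.0):
    """True when the ERL estimate may be updated (steps 1002 to 1006)."""
    return (
        hangover_ms == 0                    # step 1002: no near end hangover
        and p_rout_dbm > threshold_dbm      # step 1004: far end peak is loud enough
        and p_rout_dbm > w_sin_dbm          # step 1006: far end peak exceeds Sin power
    )
```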

FIG. 11 next shows certain representative steps 1100 that might be used in association with updating the ERL estimator. Step 1102 first shows the process of determining a long-term ERL level. This is further detailed in step 1104 wherein the long term level is defined as the window power that has been filtered through a first order Infinite Impulse Response (IIR) filter using a set coefficient. For the present example, the coefficient used is approximately equal to 0.96875. This long term level is denoted by L(n).
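The text gives the filter order and coefficient but not the exact filter form. The sketch below assumes the common leaky-integrator form L(n) = a·L(n−1) + (1 − a)·W(n); that form, and the function name `iir_step`, are assumptions. Note that 0.96875 equals 31/32.

```python
def iir_step(previous_level, window_power, coeff=0.96875):
    """One update of an assumed first order IIR smoother.

    previous_level: L(n-1), the prior long term level.
    window_power:   W(n), the new input to the filter.
    """
    return coeff * previous_level + (1.0 - coeff) * window_power
```

With this form, a constant input leaves the level unchanged, and each new input contributes 1/32 of its value to the long term level.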

Step 1106 next shows the process of computing a short term ERL estimate denoted as ERLst. Block 1108 further details this value, which is equal to the long term level at the egress point (i.e., Lrout(n)) minus the long term level at the signal-in point (i.e., Lsin(n)). In other words, the equation becomes ERLst=Lrout(n)−Lsin(n). Step 1110 next shows the process of determining a first longer term ERL estimate, which is denoted as ERLlt(n). Further details of representative steps 1100 are shown in FIG. 12.

In an illustrative embodiment of the present invention, the echo cancellers 204 and 214 each include a primary canceller and a secondary canceller. When the echo canceller 204, 214 adapts, it adapts with a secondary (background) set of coefficients. When the echo canceller 204, 214 decides this secondary set of coefficients is better than the primary set, it copies the coefficients from the secondary canceller to the primary canceller. When the echo canceller 204, 214 updates the coefficients, it also updates its ERL and ERLE estimates. In other words, there are two copies of the ERL and ERLE estimates. The primary estimate (which is denoted herein as ERL′lt and ERLE′lt) is the estimate of the ERL (or ERLE) that was determined the last time the coefficients were updated. The secondary estimate (which is denoted herein as ERLlt and ERLElt) is the current one being updated as explained above with respect to, e.g., FIGS. 10 and 11.

FIG. 12 shows further details of representative steps 1100 of FIG. 11. Block 1202 inquires whether the short term ERL is greater than the long term ERL. If the short term ERL is not greater than the long term ERL, then step 1204 shows ERLlt(n) being equal to the short term estimate filtered through a first order IIR filter having a first coefficient. For this example, this first coefficient is set to approximately 0.96875. If the short term ERL is greater than the long term ERL, then step 1206 shows ERLlt(n) being equal to the short term estimate filtered through a first order IIR filter having a second coefficient. Again for this example, the second coefficient is set to approximately 0.875. In step 1208, a second long term ERL, denoted as ERL′lt, is determined. ERL′lt is the ERL the last time the coefficients of the echo canceller were updated. At step 1210, the ERL estimate is denoted as the larger of the two values of ERLlt and ERL′lt.
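The coefficient selection of FIG. 12 might be sketched as follows. As before, the leaky-integrator filter form is an assumption, as are the function names; values are taken to be in dB. The smaller coefficient (0.875) makes the long term estimate follow the short term estimate more quickly when the short term value is the larger one.

```python
def update_erl_lt(erl_lt, erl_st, first_coeff=0.96875, second_coeff=0.875):
    """Update ERLlt from the short term estimate ERLst (steps 1202 to 1206)."""
    coeff = second_coeff if erl_st > erl_lt else first_coeff
    return coeff * erl_lt + (1.0 - coeff) * erl_st


def erl_estimate(erl_lt, erl_lt_primary):
    """Step 1210: take the larger of ERLlt and ERL'lt."""
    return max(erl_lt, erl_lt_primary)
```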

Referring again to FIG. 11, step 1112 shows the process of determining a second long term ERL estimate, which is denoted as ERLc. This value is estimated by summing the squares of the echo canceller transversal filter coefficients. Thereafter step 1114 shows that the estimate of ERL is denoted as the larger of the two terms ERLlt and ERLc.
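A coefficient-based estimate such as ERLc could be sketched as below. The text states only that the squares of the transversal filter coefficients are summed; expressing the result as a loss in dB via −10·log10 of that sum is an assumption made here so that the value is comparable with the other dB quantities, and the function name is hypothetical.

```python
import math


def erl_from_coefficients(coeffs):
    """Assumed ERLc in dB from the sum of squared filter coefficients."""
    energy = sum(h * h for h in coeffs)
    if energy <= 0.0:
        # No modeled echo path energy: treat the loss as unbounded.
        return float("inf")
    return -10.0 * math.log10(energy)
```

For a single coefficient of 0.1, the modeled echo path attenuates power by a factor of 100, which this sketch reports as approximately 20 dB.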

FIG. 13 illustrates certain representative steps 1300 that might be associated with determining the ERLE estimation, as per step 704 of FIG. 7. The ERLE estimators are updated when certain conditions are satisfied. Namely, step 1302 inquires whether the near end hangover count is zero. If yes, then a subsequent inquiry is shown in step 1304, wherein it is determined if the peak far end power is at least 15 dB greater than the window power after the echo canceller, i.e., the power at Sout. If both of these conditions are satisfied, then step 1306 shows the process of updating the ERLE estimate.
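The two conditions of FIG. 13 can be sketched as a predicate analogous to the ERL gate. The function name `should_update_erle` and the dBm inputs are assumptions for illustration.

```python
def should_update_erle(hangover_ms, p_rout_dbm, w_sout_dbm, margin_db=15.0):
    """True when the ERLE estimate may be updated (steps 1302 and 1304)."""
    return (
        hangover_ms == 0                            # step 1302
        and p_rout_dbm - w_sout_dbm >= margin_db    # step 1304: at least 15 dB
    )
```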

FIG. 14 next shows certain representative steps 1400 that might be used in updating the ERLE estimate, as per step 1306 above. Step 1402 shows the process of determining the long-term ERLE level, which is denoted as L′(n). Step 1404 shows the process of determining the short-term ERLE estimate, which is denoted as ERLEst(n), by taking the long term ERLE level for the signal in (L′sin(n)) and subtracting the long term ERLE level at the signal out (L′sout(n)). The equation then becomes ERLEst=L′sin(n)−L′sout(n). Step 1406 shows the process of determining the first long term ERLE estimate, denoted ERLElt.

FIG. 15 next details certain representative steps 1500 associated with determining this first long term ERLE estimate. This long term estimate is updated when certain conditions are met. Step 1502 shows an inquiry as to whether the far end window power (Wrout(n)) is greater than a certain level. In the present example, this level is set to approximately −36 dBm. If Wrout(n) is greater than −36 dBm, then block 1504 shows the process of updating the long term ERLE estimate. Step 1506 next inquires whether the short term ERLE is greater than the long term ERLE. If no, then step 1508 shows the value of ERLElt being determined as the short term estimate filtered through a first order IIR filter having a first coefficient. According to this present example, this first coefficient is set to approximately 0.96875. If the short term ERLE is greater than the long term ERLE, then step 1510 shows the value of ERLElt being determined as the short term estimate filtered through a first order IIR filter having a second coefficient. As per the present example, this second coefficient is set to approximately 0.875.

Referring again to FIG. 14, the secondary long term ERLE, denoted as ERLE′lt, is defined as the ERLE the last time the coefficients of the echo cancellers were updated. Step 1410 shows that the estimate ERLE is denoted as the larger of the two values ERLElt and ERLE′lt. Referring again to FIG. 7, step 706 computes the combined loss estimate to be Acom=ERL+ERLE, each of which might be determined via the steps above.

Still another series of steps might be employed to limit the maximum AGC gain. FIG. 16 shows the process of determining a maximum AGC gain. In step 1602, a certain offset is subtracted from Acom. For the present example, this offset is approximately 6 dB. Step 1604 next inquires whether this quantity (Acom−offset) is less than a certain gain limit. If yes, then the maximum AGC gain is set to (Acom−offset). If no, then the maximum AGC gain is set to the gain limit. For the purposes of the present example, this gain limit is set to approximately 24 dB. Accordingly, by applying this last series of steps, the gain can be limited to a level that will provide a stable system.
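The combination of step 706 (Acom as the sum of ERL and ERLE) with the limiting steps of FIG. 16 can be sketched in a few lines. The function name `max_agc_gain_db` is hypothetical, all quantities are assumed to be in dB, and the default values are those of the working example.

```python
def max_agc_gain_db(erl_db, erle_db, offset_db=6.0, gain_limit_db=24.0):
    """Maximum AGC gain from the combined loss estimate.

    Acom = ERL + ERLE (step 706); the maximum gain is Acom minus the
    offset (step 1602), capped at the gain limit (step 1604).
    """
    acom_db = erl_db + erle_db
    return min(acom_db - offset_db, gain_limit_db)
```

For example, an ERL of 10 dB and an ERLE of 12 dB give Acom = 22 dB and a maximum gain of 16 dB, whereas an ERL of 20 dB and an ERLE of 30 dB would exceed the limit and yield 24 dB.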

Although a preferred embodiment of the present invention has been described, it should not be construed to limit the scope of the appended claims. For example, the present invention is applicable to any real-time media, such as audio and video, in addition to the voice media illustratively described herein. Also, the invention is applicable to the recovery of any type of lost data elements, such as packets, in addition to the application to late frames described herein. Those skilled in the art will understand that various modifications may be made to the described embodiment. Moreover, to those skilled in the various arts, the invention itself herein will suggest solutions to other tasks and adaptations for other applications. It is therefore desired that the present embodiments be considered in all respects as illustrative and not restrictive, reference being made to the appended claims rather than the foregoing description to indicate the scope of the invention.

What is claimed is:
 1. A system for processing communication signals,the system comprising: a near-end echo canceller, in an outgoing signalpath of the system, operable to generate information pertaining to anoutgoing communication signal; an outgoing gain controller, in theoutgoing signal path, operable to adjust a gain of the outgoingcommunication signal based on the information pertaining to the outgoingcommunication signal; a far-end echo canceller, in an incoming signalpath of the system, operable to generate information pertaining to anincoming communication signal; and an incoming gain controller, in theincoming signal path, operable to adjust a gain of the incomingcommunication signal based on the information pertaining to the incomingcommunication signal.
 2. The system of claim 1, wherein the informationpertaining to the outgoing communication signal includes statisticalperformance information from the near-end echo canceller.
 3. The systemof claim 2, wherein the statistical performance information includes aloss estimate.
 4. The system of claim 3, wherein the loss estimateincludes an echo return loss component.
 5. The system of claim 3,wherein the loss estimate includes an echo return loss enhancementcomponent.
 6. The system of claim 1, wherein the information pertainingto the incoming communication signal includes statistical performanceinformation from the far-end echo canceller.
 7. The system of claim 6,wherein the statistical performance information includes a lossestimate.
 8. The system of claim 7, wherein the loss estimate includesan echo return loss component.
 9. The system of claim 7, wherein theloss estimate includes an echo return loss enhancement component. 10.The system of claim 1, wherein a connection between the far-end echocanceller and the incoming gain controller is a feedforward connection.11. A system for processing communication signals, the systemcomprising: one or more processors operable to: reduce near-end echo inan outgoing communication signal; generate information pertaining to theoutgoing communication signal; adjust a gain of the outgoingcommunication signal based on the information pertaining to the outgoingcommunication signal; reduce far-end echo in an incoming communicationsignal; generate information pertaining to the incoming communicationsignal; and adjust a gain of the incoming communication signal based onthe information pertaining to the incoming communication signal.
 12. Thesystem of claim 11, wherein the information pertaining to the outgoingcommunication signal includes statistical performance information. 13.The system of claim 12, wherein the statistical performance informationincludes a loss estimate.
 14. The system of claim 13, wherein the loss estimate includes an echo return loss component.
 15. The system of claim 13, wherein the loss estimate includes an echo return loss enhancement component.
 16. The system of claim 11, wherein the information pertaining to the incoming communication signal includes statistical performance information.
 17. The system of claim 16, wherein the statistical performance information includes a loss estimate.
 18. The system of claim 17, wherein the loss estimate includes an echo return loss component.
 19. The system of claim 17, wherein the loss estimate includes an echo return loss enhancement component.
 20. The system of claim 11, wherein the one or more processors are operable to feed the information pertaining to the incoming communication signal forward to adjust the gain of the incoming communication signal.
 21. A method for processing communication signals, the method comprising: reducing near-end echo in an outgoing communication signal; generating, with a near-end echo canceller, information pertaining to the outgoing communication signal; adjusting, with an outgoing gain controller, a gain of the outgoing communication signal based on the information pertaining to the outgoing communication signal; reducing far-end echo in an incoming communication signal; generating, with a far-end echo canceller, information pertaining to the incoming communication signal; and adjusting, with an incoming gain controller, a gain of the incoming communication signal based on the information pertaining to the incoming communication signal.
 22. The method of claim 21, wherein the information pertaining to the outgoing communication signal includes statistical performance information from the near-end echo canceller.
 23. The method of claim 22, wherein the statistical performance information includes a loss estimate.
 24. The method of claim 23, wherein the loss estimate includes an echo return loss component.
 25. The method of claim 23, wherein the loss estimate includes an echo return loss enhancement component.
 26. The method of claim 21, wherein the information pertaining to the incoming communication signal includes statistical performance information from the far-end echo canceller.
 27. The method of claim 26, wherein the statistical performance information includes a loss estimate.
 28. The method of claim 27, wherein the loss estimate includes an echo return loss component.
 29. The method of claim 27, wherein the loss estimate includes an echo return loss enhancement component.
 30. The method of claim 21, comprising feeding the information pertaining to the incoming communication signal forward to adjust the gain of the incoming communication signal.
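
By way of non-limiting illustration, the gain adjustment recited above might be sketched as follows. This is not the claimed implementation. It assumes that the loss estimate is the sum, in dB, of the echo return loss and echo return loss enhancement components, and that the gain is capped at that loss estimate less a stability margin. The names EchoCancellerStats, limit_gain_db, apply_gain and margin_db, and the 6 dB default margin, are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EchoCancellerStats:
    """Hypothetical container for an echo canceller's performance information."""
    erl_db: float   # echo return loss component, in dB
    erle_db: float  # echo return loss enhancement component, in dB

    @property
    def loss_estimate_db(self) -> float:
        # Assumption: the combined loss estimate is the plain sum in dB.
        return self.erl_db + self.erle_db


def limit_gain_db(desired_gain_db: float,
                  stats: EchoCancellerStats,
                  margin_db: float = 6.0) -> float:
    """Cap the requested gain at the loss estimate minus an assumed margin.

    The cap is never allowed to fall below 0 dB, so a poor loss estimate
    removes amplification but does not force attenuation.
    """
    max_gain_db = max(0.0, stats.loss_estimate_db - margin_db)
    return min(desired_gain_db, max_gain_db)


def apply_gain(samples: List[float], gain_db: float) -> List[float]:
    """Scale linear samples by a gain expressed in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]
```

In such a sketch the same functions could serve both signal paths: the outgoing gain controller would be given statistics from the near-end echo canceller, and the incoming gain controller would be given statistics fed forward from the far-end echo canceller.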